This script performs preprocessing and quality control (QC) filtering
of single-cell RNA sequencing (scRNA-seq) data generated using
conventional 10x Genomics, FLEX, and Parse Biosciences Mini and Midi
kits. The R package CRMetrics is utilized for QC in this script. While
the published version of CRMetrics (https://github.com/khodosevichlab/CRMetrics) is only
compatible with conventional 10x Genomics data, a local adaptation has
been implemented to support 10x FLEX and Parse Biosciences data.
Additionally, this version incorporates compatibility with the latest
CellBender release (version 0.3.0). For large datasets (~ 100,000
nuclei) memory limitations of R (e.g. with Conos and CRMetrics)
necessitate transitioning to Python for data analysis. More information
and application of the CRMetrics package can be found here: http://kkh.bric.ku.dk/rasmus/CRMetrics/walkthrough.html
QC filtering involves iterative adjustments to determine the optimal
thresholds for filtering depth (unique molecular identifiers [UMIs] per
cell), mitochondrial (mito) fraction (percentage of mitochondrial gene
expression per cell), and doublets (two or more cells clustered
together). Final clustering is not conducted in this script, it is
addressed in a subsequent script, “Clustering_and_annotation”.
If ambient RNA filtering is applied, it is performed first, followed
by filtering based on depth, mitochondrial fraction, and doublets. At
the end of the pipeline, a new CRMetrics object is constructed using the
filtered count matrices (=cms) to plot the violin plots for UMI and gene
counts post-filtering.
Due to its computational intensity, ambient RNA filtering is not part
of the standard analysis pipeline but is performed upon specific user
request.
Setup
library(devtools)
# loading locally adapted CRMetrics version
#load_all("/people/nrq364/CRMetrics_LW")
library(CRMetrics)
library(magrittr)
library(dplyr)
library(conos)
library(pagoda2)
library(qs)
1. fastQC and multiQC
Provide links to multiQC reports of raw sequencing data and
cellranger metrics. Does not yet work on cellranger multi (FLEX) and
Parse output.
2. Create CRMetrics object
In case of Parse data use data.path to “combined” folder, for 10x
FLEX data.path to “per_sample_outs” folder. Metadata should at least
contain “sample” and “group/condition” but can have more columns.
crm <- CRMetrics$new(data.path = "/data/counts/[project_name]/",
metadata = "path_to_metadata.csv",
n.cores = 60)
crm$metadata
qsave(crm, "crm.qs", nthreads = 10)
3. Basic metrics
Show available metrics to plot.
crm$selectMetrics()
Plot selected metrics per sample
e.g. cells per sample, median genes, median UMIs, mean reads
metrics.to.plot <- crm$selectMetrics(ids = c(1:3,5,6))
crm$plotSummaryMetrics(plot.geom = "bar", comp.group = "sample", metrics = metrics.to.plot)

Plot selected metrics per condition
We can do the same, but set the comparison group to condition. This
will add statistics to the plots. Additionaly, we can add a second
comparison group for coloring.
crm$plotSummaryMetrics(comp.group = "condition",
metrics = metrics.to.plot,
plot.geom = "point",
stat.test = "non-parametric") #second.comp.group= "sex"

Add and pot detailed metrices
crm$addCms(add.metadata=F) #for Parse data: parse = T, for FLEX: flex = T
crm$addDetailedMetrics(n.cores = 30)
get total number of sequenced nuclei
sum(sapply(crm$cms, function(d) dim(d)[2]))
[1] 61975
plot UMIs and genes per sample before filtering, horizontal lines
indicate the median values for all samples
crm$plotDetailedMetrics(comp.group = "condition",
metrics = c("UMI_count", "gene_count"),
plot.geom = "violin")

Generate a data frame with the metrics.
metrics <- crm$summary.metrics
metrics %<>% as.data.frame()
metrics
Get median genes per cell across all samples before filtering (in
case of Parse: “hg38_median_genes_per_cell”).
median(metrics[metrics$metric=="median genes per cell",]$value)
[1] 3478
Get median UMIs per cell across all samples before filtering (in case
of Parse: “hg38_median_tscp_per_cell”).
median(metrics[metrics$metric=="median umi counts per cell",]$value)
[1] 7945
If ambient RNA removal is part of the pipeline, it is performed here.
Relevant code is located further down in the script under 8.
4. Embed cells using conos
To plot the cells in an embedding, we need to preprocess the raw
count matrices with pagoda2 (seurat is also possible).
crm$doPreprocessing() #min.transcripts.per.cell = 100
Create embedding using conos. This functions executes the functions
Conos$new(cms, n.cores = n.cores), buildGraph(), findCommunities(),
embedGraph(method = “UMAP”), with default settings, which can be
adjusted.
crm$createEmbedding()
Plot the embedding. This is before filtering, so we are not yet
interested in the clusters, those are refined in the next script.
crm$plotEmbedding()

5. Filtering out low quality cells
Cell depth
Plot cell depth in the embedding or as histograms per sample.
crm$plotEmbedding(depth = TRUE,
depth.cutoff = 1000)

crm$plotDepth()

If the depth distribution varies between samples, we can use
sample-specific cutoffs.
depth_cutoff_vec <- c(1e3, 1e3, 500, 1e3, 1e3, 500, 1e3, 1e3, 1e3, 1e3, 1e3, 500) %>%
setNames(crm$detailed.metrics$sample %>%
unique() %>%
sort())
depth_cutoff_vec
control_1 control_2 control_4 control_5 control_6 control_7 treated_1 treated_2 treated_3 treated_4
1000 1000 500 1000 1000 500 1000 1000 1000 1000
treated_6 treated_7
1000 500
Plot sample specific cutoff in histogram and embedding.
crm$plotDepth(cutoff = depth_cutoff_vec)

crm$plotEmbedding(depth = TRUE,
depth.cutoff = depth_cutoff_vec)

Mitochondrial gene fraction
Investigate the mitochondrial gene fraction in the cells. The example
dataset has a very low expression of mitochondrial genes. Usual cutoff
is 5% or higher.
crm$plotEmbedding(mito.frac = TRUE,
mito.cutoff = 0.01,
species = "mouse")

We can plot the distribution of the mitochondrial fraction per sample
and also include sample-wise cutoffs (not shown here as the example
dataset has only a very low expression of mitochondrial genes).
Doublet detection
Doublet detection within CRMetrics is possible with the python
packages scrublet or DoubletDetection Scrublet is the default method,
which is fast. DoubletDetection is significantly slower, but performs
better according to this review (https://www.sciencedirect.com/science/article/pii/S2405471220304592).
Whcih one to choose depends on the dataset and technology used.
Here we use scrublet as default method. A virtual environment is used
to install scrublet.
library(reticulate)
conda_create("scrublet",
conda = "/opt/software/miniconda/4.12.0/condabin/conda",
python_version = 3.8)
Run scrublet from within R or generate a Python script to run it from
the terminal. To do this, set export = TRUE and run python scrublet.py
inside the virtual environment to generate data.
crm$detectDoublets(env = "scrublet",
conda.path = "/opt/software/miniconda/4.12.0/condabin/conda",
method = "scrublet")
Then, load data with:
crm$addDoublets()
Plot the estimated doublets in the embedding.
crm$plotEmbedding(doublet.method = "scrublet")

Plot filtered cells
Plot all cells to be filtered on the embedding.
crm$plotFilteredCells(type = "embedding",
depth = TRUE,
depth.cutoff = 1000,
doublet.method = "scrublet",
mito.frac = FALSE,
species = "mouse")
Scale for colour is already present.
Adding another scale for colour, which will replace the existing scale.

Plot the cells to be filtered per sample in a bar plot where
combination means a cell that has at least two filter labels,
e.g. doublet and depth.
crm$plotFilteredCells(type = "bar",
depth = TRUE,
depth.cutoff = 1000,
doublet.method = "scrublet",
mito.frac = FALSE,
species = "mouse")

6. Filter and save count matrices
Finally, filter the count matrices to create a cleaned list to be
used in downstream applications. If cellbender should be used then cms
needs to be added after cellbender and then all subsequent filtering is
done.
crm$filterCms(depth.cutoff = 1000,
mito.cutoff = NULL,
doublets = "scrublet",
samples.to.exclude = NULL,
species = "mouse",
verbose = TRUE)
The filtered list of count matrices is stored in $cms.filtered which
can be saved on disk.
qsave(crm$cms.filtered, "cms_filtered[_cellbender].qs",
nthreads = 10)
7. Plot detailed metrics after filtering
cms <- qread("cms_filtered[_cellbender].qs")
Create a new crm object with cms after filtering.
crm_filtered <- CRMetrics$new(cms = crm$cms.filtered, n.cores = 60, metadata="path_to_metadata.csv")
crm_filtered$addSummaryFromCms()
crm_filtered$addDetailedMetrics(n.cores = 30)
Total number of nuclei retained after QC:
sum(sapply(crm_filtered$cms, function(d) dim(d)[2]))
Plot UMIs and genes per sample after QC filtering, horizontal lines
indicate the median values for all samples.
crm_filtered$plotDetailedMetrics(comp.group = "condition",
metrics = c("UMI_count", "gene_count"),
plot.geom = "violin")
Generate a data frame with the metrics.
metrics <- crm_filtered$summary.metrics
metrics %<>% as.data.frame()
metrics
Median genes per cell across all samples after QC filtering (in case
of Parse: “hg38_median_genes_per_cell”).
median(metrics[metrics$metric=="median genes per cell",]$value)
Median UMIs per cell across all samples after QC filtering (in cas of
Parse: “hg38_median_tscp_per_cell”).
median(metrics[metrics$metric=="median umi counts per cell",]$value)
8. Optional: remove ambient RNA
We use usually cellbender, SoupX is also included in CRMetrics.
Perform this step before filtering, it changes the number of called
cells and their gene counts.
In the old cellbender version (before 0.3.0) we needed to specify
expected number of cells and total droplets included. As hinted in the
manual, the number of total droplets included could be expected number
of cells multiplied by 3, this can still be plotted here when setting
show.total.droplets=T. Additionally, in case of Parse, the following
function creates the input files for cellbender from the Parse
output.
crm$prepareCellbender(show.expected.cells = T, show.total.droplets = T)
2025-04-30 10:32:28.050553 Started run using 12 cores
2025-04-30 10:32:28.057862 Using stored HDF5 Cell Ranger/Parse outputs. To overwrite, set $cms.raw <- NULL
2025-04-30 10:32:28.058687 Using stored UMI counts calculations. To overwrite, set $cellbender$umi.counts <- NULL
2025-04-30 10:32:28.059399 Plotting
2025-04-30 10:32:28.885673 Done!

The following function prepares and saves the cellbender script. To
only select specific samples, use the samples parameter. total.droplets
and expected.cells is a logical because in cellbender from v0.3.0
total.droplets and expected.cells do not need to be specified anymore.
○ FALSE is default
○ TRUE would take expected.cells from
metadata and total droplets would be 3x expected cells
○
alternatively give own numbers (named vector)
crm$saveCellbenderScript(file = "cellbender_script.sh",
fpr = 0.01,
epochs = 150,
use.gpu = TRUE, total.droplets = F, expected.cells = F) # could be total.droplets = droplets (named vector)
We can run this script in the terminal (screen session). Here, we
activate the environment: conda activate cellbender_v0.3.0 and we run
the bash script: sh cellbender_script.sh.
Plot cellbender results
Plot the changes in cell numbers following CellBender
estimations.
crm$plotCbCells()

Plot the CellBender training results.
crm$plotCbTraining()

Plot the cell probabilities.
crm$plotCbCellProbs()

Plot the identified ambient genes per sample.
crm$plotCbAmbExp(cutoff = 0.005)

Lastly, plot the proportion of samples expressing ambient genes.
MALAT1 is identified as an ambient gene in almost all samples which is
expected.
crm$plotCbAmbGenes(cutoff = 0.005)

Add cellbender filtered cms
Then, we can add the cellbender filtered cms to our object.
Additionally, it is recommended to generate new summary metrics from the
adjusted cms which can then be plotted.
crm$cms <- NULL
crm$addCms(cellbender = T)
crm$addSummaryFromCms()
crm$detailed.metrics <- NULL
crm$addDetailedMetrics()
qsave(crm, "crm_cellbender.qs")
---
title: "1) Preprocessing and QC filtering"
date: "`r Sys.Date()`"
output:
  html_notebook:
    toc: true
    toc_float: true
    code_folding: none
---

This script performs preprocessing and quality control (QC) filtering of single-cell RNA sequencing (scRNA-seq) data generated using conventional 10x Genomics, FLEX, and Parse Biosciences Mini and Midi kits.
The R package CRMetrics is utilized for QC in this script. While the published version of CRMetrics (https://github.com/khodosevichlab/CRMetrics) is only compatible with conventional 10x Genomics data, a local adaptation has been implemented to support 10x FLEX and Parse Biosciences data. Additionally, this version incorporates compatibility with the latest CellBender release (version 0.3.0). For large datasets (~ 100,000 nuclei) memory limitations of R (e.g. with Conos and CRMetrics) necessitate transitioning to Python for data analysis. More information and application of the CRMetrics package can be found here: http://kkh.bric.ku.dk/rasmus/CRMetrics/walkthrough.html

QC filtering involves iterative adjustments to determine the optimal thresholds for filtering depth (unique molecular identifiers [UMIs] per cell), mitochondrial (mito) fraction (percentage of mitochondrial gene expression per cell), and doublets (two or more cells clustered together). Final clustering is not conducted in this script, it is addressed in a subsequent script, "Clustering_and_annotation".

If ambient RNA filtering is applied, it is performed first, followed by filtering based on depth, mitochondrial fraction, and doublets. At the end of the pipeline, a new CRMetrics object is constructed using the filtered count matrices (=cms) to plot the violin plots for UMI and gene counts post-filtering.

Due to its computational intensity, ambient RNA filtering is not part of the standard analysis pipeline but is performed upon specific user request.


# Setup
```{r, message=F}
library(devtools)
# loading locally adapted CRMetrics version
#load_all("/people/nrq364/CRMetrics_LW")
library(CRMetrics)
library(magrittr)
library(dplyr)
library(conos)
library(pagoda2)
library(qs)
```
# 1. fastQC and multiQC
Provide links to multiQC reports of raw sequencing data and cellranger metrics. Does not yet work on cellranger multi (FLEX) and Parse output.

# 2. Create CRMetrics object
In case of Parse data use data.path to "combined" folder, for 10x FLEX data.path to "per_sample_outs" folder. 
Metadata should at least contain "sample" and "group/condition" but can have more columns.
```{r}
crm <- CRMetrics$new(data.path = "/data/counts/[project_name]/", 
                     metadata = "path_to_metadata.csv", 
                     n.cores = 60)
```

```{r}
crm$metadata
```

```{r}
qsave(crm, "crm.qs", nthreads = 10)
```

# 3. Basic metrics

Show available metrics to plot.
```{r}
crm$selectMetrics()
```

## Plot selected metrics per sample
e.g. cells per sample, median genes, median UMIs, mean reads
```{r, fig.height=6, warning=F}
metrics.to.plot <- crm$selectMetrics(ids = c(1:3,5,6))
crm$plotSummaryMetrics(plot.geom = "bar", comp.group = "sample", metrics = metrics.to.plot)
```

## Plot selected metrics per condition
We can do the same, but set the comparison group to condition. This will add statistics to the plots. Additionaly, we can add a second comparison group for coloring.
```{r, fig.height=6, warning=F}
crm$plotSummaryMetrics(comp.group = "condition",
                       metrics = metrics.to.plot, 
                       plot.geom = "point", 
                       stat.test = "non-parametric") #second.comp.group= "sex"
```

## Add and pot detailed metrices
```{r}
crm$addCms(add.metadata=F) #for Parse data: parse = T, for FLEX: flex = T
crm$addDetailedMetrics(n.cores = 30)
```

get total number of sequenced nuclei
```{r}
sum(sapply(crm$cms, function(d) dim(d)[2]))
```

plot UMIs and genes per sample before filtering, horizontal lines indicate the median values for all samples
```{r, warning=F}
crm$plotDetailedMetrics(comp.group = "condition",
                        metrics = c("UMI_count",  "gene_count"), 
                        plot.geom = "violin")
```
Generate a data frame with the metrics.
```{r}
metrics <- crm$summary.metrics
metrics %<>% as.data.frame()
metrics
```
Get median genes per cell across all samples before filtering
(in case of Parse: "hg38_median_genes_per_cell").
```{r}
median(metrics[metrics$metric=="median genes per cell",]$value)
```

Get median UMIs per cell across all samples before filtering
(in case of Parse: "hg38_median_tscp_per_cell").
```{r}
median(metrics[metrics$metric=="median umi counts per cell",]$value)
```

If ambient RNA removal is part of the pipeline, it is performed here. Relevant code is located further down in the script under 8.

# 4. Embed cells using conos 
To plot the cells in an embedding, we need to preprocess the raw count matrices with pagoda2 (seurat is also possible).
```{r}
crm$doPreprocessing() #min.transcripts.per.cell = 100
```

Create embedding using conos. This functions executes the functions Conos$new(cms, n.cores = n.cores), buildGraph(), findCommunities(), embedGraph(method = "UMAP"), with default settings, which can be adjusted. 
```{r}
crm$createEmbedding()
```

Plot the embedding. This is before filtering, so we are not yet interested in the clusters, those are refined in the next script.
```{r}
crm$plotEmbedding()
```

# 5. Filtering out low quality cells
## Cell depth
Plot cell depth in the embedding or as histograms per sample.
```{r}
crm$plotEmbedding(depth = TRUE, 
             depth.cutoff = 1000)
```
```{r, fig.height=6, warning=F}
crm$plotDepth()
```

If the depth distribution varies between samples, we can use sample-specific cutoffs.
```{r}
depth_cutoff_vec <- c(1e3, 1e3, 500, 1e3, 1e3, 500, 1e3, 1e3, 1e3, 1e3, 1e3, 500) %>% 
  setNames(crm$detailed.metrics$sample %>% 
             unique() %>% 
             sort())

depth_cutoff_vec
```

Plot sample specific cutoff in histogram and embedding.
```{r, warning=F, fig.height=6}
crm$plotDepth(cutoff = depth_cutoff_vec)
```

```{r}
crm$plotEmbedding(depth = TRUE, 
             depth.cutoff = depth_cutoff_vec)
```

## Mitochondrial gene fraction
Investigate the mitochondrial gene fraction in the cells. The example dataset has a very low expression of mitochondrial genes. Usual cutoff is 5% or higher.
```{r}
crm$plotEmbedding(mito.frac = TRUE, 
             mito.cutoff = 0.01, 
             species = "mouse")
```

We can plot the distribution of the mitochondrial fraction per sample and also include sample-wise cutoffs (not shown here as the example dataset has only a very low expression of mitochondrial genes).
```{r, echo=TRUE, results='hide', fig.height=6, warning=F}
crm$plotMitoFraction(cutoff = 0.01)
```

## Doublet detection
Doublet detection within CRMetrics is possible with the python packages scrublet or DoubletDetection
Scrublet is the default method, which is fast. DoubletDetection is significantly slower, but performs better according to this review (https://www.sciencedirect.com/science/article/pii/S2405471220304592). Whcih one to choose depends on the dataset and technology used.

Here we use scrublet as default method.
A virtual environment is used to install scrublet.
```{r}
library(reticulate)
conda_create("scrublet", 
             conda = "/opt/software/miniconda/4.12.0/condabin/conda", 
             python_version = 3.8)
```


Run scrublet from within R or generate a Python script to run it from the terminal. To do this, set export = TRUE and run python scrublet.py inside the virtual environment to generate data.
```{r}
crm$detectDoublets(env = "scrublet",
                   conda.path = "/opt/software/miniconda/4.12.0/condabin/conda",
                   method = "scrublet")
```

Then, load data with:
```{r}
crm$addDoublets()
```

Plot the estimated doublets in the embedding.
```{r}
crm$plotEmbedding(doublet.method = "scrublet")
```


## Plot filtered cells
Plot all cells to be filtered on the embedding.
```{r}
crm$plotFilteredCells(type = "embedding", 
                      depth = TRUE, 
                      depth.cutoff = 1000, 
                      doublet.method = "scrublet", 
                      mito.frac = FALSE, 
                      species = "mouse")
```

Plot the cells to be filtered per sample in a bar plot where combination means a cell that has at least two filter labels, e.g. doublet and depth.
```{r, warning=F}
crm$plotFilteredCells(type = "bar", 
                      depth = TRUE, 
                      depth.cutoff = 1000, 
                      doublet.method = "scrublet", 
                      mito.frac = FALSE, 
                      species = "mouse")
```


# 6. Filter and save count matrices
Finally, filter the count matrices to create a cleaned list to be used in downstream applications.
If cellbender should be used then cms needs to be added after cellbender and then all subsequent filtering is done.
```{r}
crm$filterCms(depth.cutoff = 1000, 
              mito.cutoff = NULL, 
              doublets = "scrublet",
              samples.to.exclude = NULL,
              species = "mouse",
              verbose = TRUE)
```

The filtered list of count matrices is stored in $cms.filtered which can be saved on disk.
```{r}
qsave(crm$cms.filtered, "cms_filtered[_cellbender].qs", 
      nthreads = 10)
```

# 7. Plot detailed metrics after filtering
```{r}
cms <- qread("cms_filtered[_cellbender].qs")
```

Create a new crm object with cms after filtering.
```{r}
crm_filtered <- CRMetrics$new(cms = crm$cms.filtered, n.cores = 60, metadata="path_to_metadata.csv")
crm_filtered$addSummaryFromCms()
crm_filtered$addDetailedMetrics(n.cores = 30)
```

Total number of nuclei retained after QC:
```{r}
sum(sapply(crm_filtered$cms, function(d) dim(d)[2]))
```

Plot UMIs and genes per sample after QC filtering, horizontal lines indicate the median values for all samples.
```{r}
crm_filtered$plotDetailedMetrics(comp.group = "condition",
                        metrics = c("UMI_count",  "gene_count"), 
                        plot.geom = "violin")
```
Generate a data frame with the metrics.
```{r}
metrics <- crm_filtered$summary.metrics
metrics %<>% as.data.frame()
metrics
```
Median genes per cell across all samples after QC filtering (in case of Parse: "hg38_median_genes_per_cell").
```{r}
median(metrics[metrics$metric=="median genes per cell",]$value)
```

Median UMIs per cell across all samples after QC filtering (in cas of Parse: "hg38_median_tscp_per_cell").
```{r}
median(metrics[metrics$metric=="median umi counts per cell",]$value)
```


# 8. Optional: remove ambient RNA
```{r, include=F}
crm <- qread("crm_cellbender_without_ctrl3_and_treated5.qs", nthreads=10)
```

We use usually cellbender, SoupX is also included in CRMetrics.
Perform this step before filtering, it changes the number of called cells and their gene counts.

In the old cellbender version (before 0.3.0) we needed to specify expected number of cells and total droplets included. As hinted in the manual, the number of total droplets included could be expected number of cells multiplied by 3, this can still be plotted here when setting show.total.droplets=T.
Additionally, in case of Parse, the following function creates the input files for cellbender from the Parse output. 
```{r, fig.height=6, warning=F}
crm$prepareCellbender(show.expected.cells = T, show.total.droplets = T)
```


The following function prepares and saves the cellbender script. To only select specific samples, use the samples parameter. 
total.droplets and expected.cells is a logical because in cellbender from v0.3.0 total.droplets and expected.cells do not need to be specified anymore. <br>
		○ FALSE is default <br>
		○ TRUE would take expected.cells from metadata and total droplets would be 3x expected cells <br>
		○ alternatively give own numbers (named vector) 
```{r}
crm$saveCellbenderScript(file = "cellbender_script.sh", 
                         fpr = 0.01, 
                         epochs = 150, 
                         use.gpu = TRUE, total.droplets = F, expected.cells = F) # could be total.droplets = droplets (named vector)
```

We can run this script in the terminal (screen session). Here, we activate the environment: conda activate cellbender_v0.3.0 and we run the bash script: sh cellbender_script.sh.

## Plot cellbender results
Plot the changes in cell numbers following CellBender estimations.
```{r, warning=F}
crm$plotCbCells()
```
```{r, eval=F, include=F}
devtools::load_all("/people/nrq364/CRMetrics_LW")
crm %<>% CRMetrics$new()
```

Plot the CellBender training results.
```{r, fig.height=6, warning=F}
crm$plotCbTraining()
```

Plot the cell probabilities.
```{r, fig.height=6, warning=F}
crm$plotCbCellProbs()
```

Plot the identified ambient genes per sample.
```{r, fig.height=6}
crm$plotCbAmbExp(cutoff = 0.005)
```

Lastly, plot the proportion of samples expressing ambient genes. MALAT1 is identified as an ambient gene in almost all samples which is expected.
```{r, warning=F}
crm$plotCbAmbGenes(cutoff = 0.005)
```

## Add cellbender filtered cms 
Then, we can add the cellbender filtered cms to our object. Additionally, it is recommended to generate new summary metrics from the adjusted cms which can then be plotted.
```{r}
crm$cms <- NULL
crm$addCms(cellbender = T)
crm$addSummaryFromCms()
crm$detailed.metrics <- NULL
crm$addDetailedMetrics()
qsave(crm, "crm_cellbender.qs")
```
